Inverted Index Support for Numeric Search

Authors

Marcus Fontoura

Ronny Lempel

Runping Qi

Jason Y. Zien

Abstract

Today’s search engines are increasingly required to broaden their capabilities beyond free-text search. More complex features, such as supporting range constraints over numeric data, are becoming common; structured search over XML data will soon follow. This is particularly true in the enterprise search domain, where engines attempt to integrate data from the Web and corporate knowledge portals with data residing in proprietary databases. In this paper we extend previous schemes by which an inverted index based search engine can efficiently support queries that contain numeric restrictions in addition to standard, free-text portions. Furthermore, we analyze both the known schemes and our extensions in terms of index-build time, index space and query processing time. We show how to maximize query processing performance while respecting limits on index size and build time, or conversely, how to minimize index space and build time while maintaining guarantees on runtime performance. Thus, we concisely analyze the trade-off between index size and build time, and runtime performance. Finally, we present experimental results that demonstrate significant performance benefits attained by our method, as compared to alternative approaches.
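The abstract does not spell out the indexing schemes themselves, so the following is only an illustrative sketch of one well-known way to combine a free-text term with a numeric range restriction in an inverted index; it should not be read as the authors' method. It assumes non-negative integer values below `2**BITS`, and the names `Index`, `cover`, and `BITS` are hypothetical. Each value is indexed under one extra term per binary-prefix level, which costs index space and build time, so that a range at query time is covered by a small number of posting lists.

```python
from collections import defaultdict

BITS = 16  # assumption: numeric values are integers in [0, 2**BITS)


def cover(lo, hi):
    """Split the inclusive range [lo, hi] into aligned power-of-two blocks.

    Each block is returned as (level, prefix) and spans the values
    prefix * 2**level .. (prefix + 1) * 2**level - 1.
    """
    blocks = []
    while lo <= hi:
        level = 0
        # Grow the block while it stays aligned at lo and still fits below hi.
        while (level < BITS
               and lo % (1 << (level + 1)) == 0
               and lo + (1 << (level + 1)) - 1 <= hi):
            level += 1
        blocks.append((level, lo >> level))
        lo += 1 << level
    return blocks


class Index:
    """Toy inverted index: term -> set of document ids."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text, value):
        for word in text.lower().split():
            self.postings[("text", word)].add(doc_id)
        # One numeric term per level: more levels mean a larger index
        # but fewer posting lists to merge per range query.
        for level in range(BITS + 1):
            self.postings[("num", level, value >> level)].add(doc_id)

    def search(self, word, lo, hi):
        in_range = set()
        for level, prefix in cover(lo, hi):
            in_range |= self.postings.get(("num", level, prefix), set())
        text_hits = self.postings.get(("text", word.lower()), set())
        return sorted(text_hits & in_range)


index = Index()
index.add(1, "red laptop", 450)
index.add(2, "red phone", 900)
index.add(3, "blue laptop", 700)
index.add(4, "red laptop", 1200)
print(index.search("laptop", 400, 1000))
```

Indexing fewer levels would shrink the index at the price of more posting lists per query, which is the kind of space versus runtime trade-off the abstract refers to.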

Similar resources

A Generic Inverted Index Framework for Similarity Search on the GPU - Technical Report

Data variety, as one of the three Vs of the Big Data, is manifested by a growing number of complex data types such as documents, sequences, trees, graphs and high dimensional vectors. To perform similarity search on these data, existing works mainly choose to create customized indexes for different data types. Due to the diversity of customized indexes, it is hard to devise a general paralleliz...

Parallel Search Using Partitioned Inverted Files

We examine the search of partitioned inverted files with particular emphasis on issues that arise from different types of partitioning methods. Two types of index partitions are investigated: namely TermId and DocId. We describe the search operations implemented in order to support parallelism in probabilistic search. We also describe higher level features such as search topologies in parallel ...

Access-Ordered Indexes

Search engines are an essential tool for modern life. We use them to discover new information on diverse topics and to locate a wide range of resources. The search process in all practical search engines is supported by an inverted index structure that stores all search terms and their locations within the searchable document collection. Inverted indexes are highly optimised, and significant wo...

Inverted indexes: Types and techniques

There has been a substantial amount of research on high performance inverted index because most web and search engines use an inverted index to execute queries. Documents are normally stored as lists of words, but inverted indexes invert this by storing for each word the list of documents that the word appears in, hence the name “inverted index”. This paper presents the crucial research findin...

3D Inverted Index with Cache Sharing for Web Search Engines

Web search engines achieve efficient performance by partitioning and replicating the indexing data structure used to support query processing. Current practice simply partitions and replicates the text collection on the set of cluster processors and then constructs in each processor an index data structure. This paper proposes a different approach by constructing an index data structure that pr...

Journal title:

Internet Mathematics

Volume 3, Issue

Pages -

Publication date: 2007
